Refactor serializer: rearrange code for clarity and introspection #1148

dmsnell · 2017-06-12T20:55:15Z

@see #948 for background exploration

These changes are intended to make the flow of transformation from
in-memory block data to serialized post_content clearer. Additionally
they are intended to ease testing and create points for easy trapping
and transformation of the data through the pipeline.

As I'm working in the serializer file to make harmonious updates with the parser changes I'm finding it unclear what is and what should be going on in serialization. Hopefully this PR will serve as a base for future work.

It makes subjective calls on style; I know. Please let me know if you don't like it. ~~Performance shouldn't be an issue and if you think it will please respond with data backing that up.~~ Unlike in #948 I tried to give a reasonable 👁 to performance in this incarnation and I don't anticipate it being reasonably less-performant than the current implementation in production.

My plans are to continue to augment this (and update the tests) so that we can have more control over the serialization process. This additional control might give us more ability to automatically detect if posts have been altered out of Gutenberg and if so, if they have been altered in a deleterious manner.

nylen · 2017-06-12T22:13:38Z

> This additional control might give us more ability to automatically detect if posts have been altered out of Gutenberg

As discussed elsewhere (#844 for example), this is pretty worrisome. We've chosen a data storage mechanism (post_content) which in software changes is ancient; it's our responsibility to recognize that it has other uses besides our project.

> and if so, if they have been altered in a deleterious manner.

This is more like it. We should be validating inputs, not detecting changes.

nylen · 2017-06-12T22:36:52Z

blocks/api/serializer.js

-		Object.keys( realAttributes ),
-		Object.keys( expectedAttributes )
-	);
+export function attributesToSave( allAttributes, fromContent ) {


I think getCommentAttributes was a better name. I would say the signature of this function should be:

function getCommentAttributes( allAttributes, attributesFromContent ) {

that's fine with me. I found it confusing (which is why I changed it) but I don't care too much

nylen · 2017-06-12T22:38:06Z

blocks/api/serializer.js


-		return memo + `${ key }="${ value }" `;
-	}, '' );
+	const transform = key => escapeDoubleQuotes( allAttributes[ key ] );


Transforming should not really be the responsibility of the function that goes from all attributes to attributes saved in comments. This should be done during the serialization-to-string step.

fair point. the reason I did this was to avoid iterating twice. granted, that's an optimization

it did however seem reasonable to do here since this function was taking "attributes from a block" and returning "attributes that need to be saved in the comment header." if we don't transform here, we have a function which returns a set of attributes in a state that should never exist: "the group of attributes which need to serialize in the comment header but which can't be saved as-is because of potential serialization bugs"

thoughts?

nylen · 2017-06-12T22:40:24Z

blocks/api/serializer.js

+	// Iterate over attributes and produce the set to save
+	return reduce(
+		Object.keys( allAttributes ),
+		( toSave, key ) => Object.assign( toSave, isValid( key ) && { [ key ]: transform( key ) } ),


This feels a bit too magical. What does Object.assign( toSave, false ) actually do?

Combined with the above comment, how about filter( allAttributes, ( value, key ) => isValid( key ) ) instead?

Also, do we really need isValid at all? Isn't it enough to say simply !! attributesFromContent[ key ] ?

> This feels a bit too magical.

I can see where it feels magical, but it's just the behavior of Object.assign() by specification.

> how about filter

this seems very reasonable. I'll update that.

> Also, do we really need isValid at all? Isn't it enough to say simply !! attributesFromContent[ key ] ?

well if that's all we said we'd be wrong in the case where allAttributes[ key ] is undefined and it doesn't exist in attributesFromContent. trying to pull out the actual operations that are happening to determine if a key is valid is part of why I created isValid() since before it was spread across different lines.
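
To make the two points in this thread concrete, here is a minimal sketch in plain JavaScript (no lodash). It first shows that `Object.assign()` ignores a `false` source, then a filter-based version of the selection. The name `getCommentAttributes` and the parameter names follow the suggestion earlier in this review; the `isValid` condition is an assumption based on the description above (keep a key only when it has a defined value and is not already recoverable from the content), not the merged implementation.

```javascript
// Object.assign() copies own enumerable properties from each source.
// A `false` source has none, so the target comes back unchanged.
const target = { align: 'left' };
const result = Object.assign( target, false );

// Sketch only: pick the attributes that must be saved in the comment.
function getCommentAttributes( allAttributes, attributesFromContent ) {
	// Assumed rule: valid when the value is defined and the key
	// is not already provided by the block's content.
	const isValid = ( key ) =>
		attributesFromContent[ key ] === undefined &&
		allAttributes[ key ] !== undefined;

	return Object.keys( allAttributes )
		.filter( isValid )
		.reduce( ( toSave, key ) => {
			toSave[ key ] = allAttributes[ key ];
			return toSave;
		}, {} );
}

// `caption` is undefined in both places, so checking only
// attributesFromContent would wrongly keep it.
const saved = getCommentAttributes(
	{ align: 'left', content: 'hi', caption: undefined },
	{ content: 'hi' }
);
```

Here `saved` holds only `align`: `content` is dropped because it comes from the content, and `caption` is dropped because it has no value to save.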

dmsnell · 2017-06-13T14:35:08Z

> This is more like it. We should be validating inputs, not detecting changes.

@nylen this is the whole point. the goal isn't to slap people on the wrist for using an external editor, but it's to determine if any external edits caused any change in the structural content of the block and if so, present the user with some indication that decay has occurred.

dmsnell · 2017-06-13T19:55:07Z

blocks/api/post.pegjs

  { return keyValue( name, value ) }
-  / name:HTML_Attribute_Name _* "=" _* "'" value:$(("\\'" . / !"'" .)*) "'"
+  / name:HTML_Attribute_Name _* "=" _* "'" value:$(("\\" "'" . / !"'" .)*) "'"


More on these changes later…

youknowriad · 2017-06-13T20:06:38Z

blocks/api/serializer.js

-		return memo + `${ key }="${ value }" `;
-	}, '' );
+			return ! ( contentValue !== undefined || allValue === undefined )
+				? Object.assign( toSave, { [ key ]: escapeDoubleQuotes( allValue ) } )


It feels a bit weird to me that we're encoding the attributes values in this function but we generate the key="value" in the other function. Serialization of the comment attributes is split between two methods.

I'd think we should avoid encoding here, or generate the complete string here because encoding the values without generating a string makes no sense to me.

rearranged in fdd4aa72
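
As a rough illustration of the concern raised here, the sketch below keeps encoding and string generation together in one step, so that no intermediate "encoded but not yet a string" state exists. It is not the code from fdd4aa72. `serializeCommentAttributes` is a hypothetical name, and the body of `escapeDoubleQuotes` (prefixing each double quote with a backslash) is an assumption made for the example.

```javascript
// Assumed behaviour: put a backslash in front of every double quote.
const escapeDoubleQuotes = ( value ) => String( value ).replace( /"/g, '\\"' );

// Hypothetical helper: takes the already-selected attributes and
// produces the key="value" pairs for the comment header in one pass.
function serializeCommentAttributes( attributes ) {
	return Object.keys( attributes )
		.map( ( key ) => `${ key }="${ escapeDoubleQuotes( attributes[ key ] ) }"` )
		.join( ' ' );
}

const header = serializeCommentAttributes( {
	align: 'left',
	title: 'say "hi"',
} );
```

With this split, the function that selects attributes returns raw values only, and escaping happens where the string is built.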

westonruter · 2017-06-15T12:09:23Z

blocks/api/post.pegjs

@@ -1,8 +1,14 @@
 {

+function untransformValue( value ) {
+	return 'string' === typeof value
+		? value.replace( /\\-/g, '-' )


Is this right? What if you have the string "\\\\-"?

this needs to pair with the serializer to make sure we don't get here. we have to face escaping in some manner. doing it here I think is the easiest way to accomplish this: "You must escape hyphens. You need not escape anything else."

Also, see #1088 for background on the issues with a double-hyphen in an HTML comment

~~I think this means we should escape \ as well?~~
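
The pairing described in this thread can be checked with a small sketch. The two regular expressions are taken from the diff hunks in this review; the names `unescapeHyphens` and `roundTrip` are made up for the example. The claim being illustrated is narrow: if every hyphen gets exactly one backslash on the way out and one backslash is removed before every hyphen on the way in, values written by the serializer should survive the round trip even when they already contain backslashes. Hand-edited content is a different matter.

```javascript
// Serializer side (regex from the serializer.js hunk).
const escapeHyphens = ( value ) => value.replace( /-/g, '\\-' );

// Parser side (regex from the post.pegjs hunk); name is hypothetical.
const unescapeHyphens = ( value ) => value.replace( /\\-/g, '-' );

const roundTrip = ( value ) => unescapeHyphens( escapeHyphens( value ) );

// A value that already holds a backslash followed by a hyphen.
const tricky = 'a\\-b';

// After escaping, every hyphen follows a backslash, so the
// output cannot contain two hyphens in a row.
const escaped = escapeHyphens( 'a--b' );

// Content typed by hand with a literal backslash-hyphen is not
// distinguishable from escaped output and loses its backslash.
const handWritten = unescapeHyphens( 'a\\-b' );
```

In this sketch `roundTrip( tricky )` gives back `tricky`, while `handWritten` comes out as `a-b`, which is the case the rule "you must escape hyphens" asks external editors to respect.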

westonruter · 2017-06-15T12:09:52Z

blocks/api/serializer.js

@@ -36,35 +36,94 @@ export function getSaveContent( save, attributes ) {
 	return wp.element.renderToString( rawContent );
 }

+const escapeDoubleQuotes = value => value.replace( /"/g, '\"' );
+const replaceHyphens = value => value.replace( /-/g, '\\-' );


Maybe it should be escapeHyphens for consistency?

westonruter · 2017-06-15T12:10:47Z

blocks/api/serializer.js

+ * @returns {*}       transformed value
+ */
+const serializeValue = value =>
+	'string' === typeof value


What if it is an array or object that contains strings?

they shouldn't be. we should already be JSON-serializing data if not a number or string.

this will change if we stop using name="value" pairings, which is being worked on in another PR at the moment (not yet published)

youknowriad

LGTM 👍

aduth · 2017-06-15T13:31:22Z

blocks/test/full-content.js

@@ -127,7 +127,7 @@ describe( 'full post content fixture', () => {
 				}
 			}

-			expect( serializedActual ).to.eql( serializedExpected );
+			expect( serializedActual.trim() ).to.eql( serializedExpected.trim() );


Do we want (or need) to be forgiving with leading/trailing whitespace?

In my opinion we shouldn't be testing for the specific presence or absence of it.

dmsnell added the [Feature] Parsing and [Status] In Progress labels Jun 12, 2017

dmsnell requested review from youknowriad, mtias and aduth June 12, 2017 20:55

nylen reviewed Jun 12, 2017

dmsnell force-pushed the refactor/serializer branch from b4317d1 to 1ad4c77 Compare June 13, 2017 19:54

dmsnell commented Jun 13, 2017

youknowriad reviewed Jun 13, 2017

dmsnell force-pushed the refactor/serializer branch 2 times, most recently from eb87c72 to a0a8351 Compare June 14, 2017 23:30

dmsnell added 8 commits June 15, 2017 13:14

Updates from feedback

1166e68

inline functions

d495cf7

add explanatory comment

a05d749

styling and comment

24fdbe2

moar updates

7e8c2c9

feedback updates

7d82605

Remove import of some

f97ee6a

dmsnell force-pushed the refactor/serializer branch from 01d9635 to f97ee6a Compare June 15, 2017 11:14

dmsnell added 2 commits June 15, 2017 13:35

Escape hyphens in block comments to prevent breaking parse

f97431f

Unescape hyphens in block comment headers

05b15a0

westonruter reviewed Jun 15, 2017

dmsnell added 2 commits June 15, 2017 14:45

Update tests

065141e

Rename replaceHyphens -> escapeHyphens

470f1b3

youknowriad approved these changes Jun 15, 2017

dmsnell merged commit b2c83a7 into master Jun 15, 2017

dmsnell deleted the refactor/serializer branch June 15, 2017 13:16

aduth reviewed Jun 15, 2017
